Abstract: Similar to biological evolution and speciation we define a language
through a string of 8 or 16 bits. The parent gives its language to its
children, apart from a random mutation from zero to one or from one to 
zero; initially all bits are zero. The Verhulst deaths are taken as propor- 
tional to the total number of people, while in addition languages spoken by 
many people are preferred over small languages. For a fixed population size, 
a sharp phase transition is observed: For low mutation rates, one language 
contains nearly all people; for high mutation rates, no language dominates 
and the size distribution of languages is roughly log-normal as for present 
human languages. A simple scaling law is valid. 
1 Introduction

Human languages are grouped into families, like the Indo-European lan- 
guages, which may all have arisen from one common original language. For 
example, ancient Latin split into Portuguese, Spanish, French, Italian, Romanian
and other languages during the last two millennia. On the other hand,
many of the present languages are spoken only by a relatively small num- 
ber of people and are in danger of extinction [H In this way languages 
are similar to biological species. We thus try to simulate languages using 
methods similar to the modelling of speciation 011]. 

A language for us can be a human language (including Fortran, ...), a sign 
language, a system of bird songs, a human alphabet, or any other system of 
communication. We simulate it by a string of 8, 16 or 30 bits and define 
languages as different if they differ in at least one bit. The position of the 
bit in the string plays no role, in contrast to the Penna ageing model from 
which program elements are taken [5]. 



2 Model 



We start with one person, i.e. N(t = 0) = 1, speaking language zero (all
bits are zero). Then at each iteration t all N(t) living people are subject to
a Verhulst death, i.e. they die with probability N(t)/K where K in biology
is often called the carrying capacity and incorporates the limitations of food 
and space. Each survivor produces one offspring at each iteration which uses 
the same bitstring apart from one random mutation (bit changed from 0 to 1
or from 1 to 0) which happens with a probability p per person (or p/8 per bit
if the language has 8 bits). Usually, all bit-strings are assumed to be equally 
fit, in contrast to typical biological models OEl- 
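
The death and birth rules just described can be sketched in a few lines of Python. This is only an illustrative sketch, not the program used for the simulations: the function names `mutate` and `step`, the representation of each person as an integer bit-string, and the use of Python's `random` module are all assumptions made here.

```python
from __future__ import annotations

import random


def mutate(language: int, bits: int, p: float, rng: random.Random) -> int:
    """Child's language: with probability p, flip one randomly chosen bit."""
    if rng.random() < p:
        language ^= 1 << rng.randrange(bits)
    return language


def step(population: list[int], K: int, bits: int, p: float,
         rng: random.Random) -> list[int]:
    """One iteration: Verhulst deaths, then one offspring per survivor."""
    # Verhulst probability, fixed at the start of the iteration.
    y = len(population) / K
    survivors = [lang for lang in population if rng.random() >= y]
    # Each survivor has one child, which inherits the (possibly mutated) language.
    children = [mutate(lang, bits, p, rng) for lang in survivors]
    return survivors + children
```

Starting from `population = [0]` and calling `step` repeatedly lets the population grow until deaths and births balance.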

Also at each iteration, each individual can switch from its present lan- 
guage to another randomly selected one, with probability 

(2N(t)/K)(1 - x^2)

where x is the fraction of all people speaking the present language of that 
individual. The first factor, which approaches unity for long times, ensures 
that at the beginning with a low population density there is not yet much 
competition between languages, while in the later stationary high population 
the less spoken languages are in danger of extinction. The exponent two takes 
into account that normally two people communicate with each other; thus 
the survival probability of a language is proportional to the square of the 
number of people speaking it. 
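
The switching rule can be sketched as follows. Several details are assumptions of this sketch rather than statements of the text: the new language is drawn uniformly from all 2^bits bit-strings (and may coincide with the old one), the fractions x are computed once at the start of the step, and the probability is capped at 1 in case N(t) exceeds K/2. The names `switch_probability` and `switch_step` are hypothetical.

```python
import random
from collections import Counter


def switch_probability(n_total: int, K: int, x: float) -> float:
    """(2N/K)(1 - x^2), capped at 1 (the cap is an assumption of this sketch)."""
    return min(1.0, (2.0 * n_total / K) * (1.0 - x * x))


def switch_step(population, K, bits, rng):
    """Let every individual switch language with the probability above."""
    counts = Counter(population)  # speakers per language at the start of the step
    n = len(population)
    result = []
    for lang in population:
        x = counts[lang] / n  # fraction speaking this individual's language
        if rng.random() < switch_probability(n, K, x):
            # Assumed: uniform choice among all possible bit-strings.
            lang = rng.randrange(1 << bits)
        result.append(lang)
    return result
```

A language spoken by everybody (x = 1) is never abandoned, while speakers of very small languages (x near 0) switch almost certainly once the population is close to K/2.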

(The final population is K/2 and not K since we determine the Verhulst
probability y = N(t - 1)/K at the beginning of iteration t and leave it at that
value for the whole iteration. The Verhulst deaths thus reduce the population
by a factor 1 - y, and if each of the survivors has b offspring, the population
is multiplied by another factor 1 + b. For a stationary population, these two
factors have to cancel: (1 - y)(1 + b) = 1, giving y = b/(1 + b) = 1/2 for our
choice b = 1.)
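
The stationary value can be checked numerically by iterating the average population from one step to the next. The sketch below ignores random fluctuations (a mean-field simplification assumed here), and the function name `stationary_population` is hypothetical.

```python
def stationary_population(K: float, b: float = 1.0, n0: float = 1.0,
                          steps: int = 200) -> float:
    """Iterate N <- N (1 - N/K) (1 + b) and return the final value."""
    n = n0
    for _ in range(steps):
        y = n / K                      # Verhulst probability for this iteration
        n = n * (1.0 - y) * (1.0 + b)  # deaths, then b offspring per survivor
    return n


print(stationary_population(1000.0))  # approaches K/2 for b = 1
```

For b = 1 the iteration settles at K b/(1 + b) = K/2, in agreement with the argument above.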

3 Results 

For an eventual stationary population of ten million at t = 1000, as a function
of increasing mutation rate p, a sharp transition was observed between a
dominance regime at low and a smooth distribution at high mutation rates
p, Fig. 1:
i) For low p, one language, usually the one with all bits zero, contains
nearly all individuals, and the mutant languages differing from the dominant 
one by one bit only contain most of the rest. This behaviour is hardly realistic 
except for alphabets. 

ii) For high mutation rates p, on the other hand, no language contains 
a large fraction of the population, and the distribution of language sizes 
(measured as the number of people speaking it) is roughly log-normal with 
higher statistics for small languages. This result agrees well with reality [2]. 

In Fig. 1, part a shows the drastic difference between dominance (+) and
smooth distribution (x, stars), part b the slow approach to a symmetric log- 
normal distribution with increasing mutation rate. (We bin the number 
of people speaking one language into powers of two, lumping together all 
languages spoken by 33 to 64 people, for example.) 
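
The binning into powers of two can be sketched as below. The helper names `size_bin` and `size_histogram` are hypothetical, and the choice that bin k collects sizes from 2^(k-1) + 1 to 2^k is inferred from the example of 33 to 64 speakers.

```python
from collections import Counter


def size_bin(size: int) -> int:
    """Bin index k such that 2**(k-1) < size <= 2**k (size 1 goes to bin 0)."""
    return (size - 1).bit_length()


def size_histogram(population) -> Counter:
    """Number of languages per power-of-two size bin."""
    speakers = Counter(population)  # language -> number of speakers
    return Counter(size_bin(s) for s in speakers.values())
```

With this convention all languages with 33 to 64 speakers fall into bin 6, while 32 speakers belong to bin 5 and 65 speakers to bin 7.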

In the dominance regime i) of low p, the number L(t) of languages first
increases from unity towards about 10^ and then decreases again to about a 
dozen (not counting languages with less than 10 speakers). In the smooth 
regime ii) of high p the number L of languages first increases and then reaches 
a plateau, which may even equal the maximal number M = 2^8 or M = 2^16
for 8 or 16 bits, respectively. 

Also for a fixed mutation rate as a function of the final population K/2 
we see a change from the dominance regime at low populations to a smooth 
distribution at high populations, Fig. 2. For very large populations a rather
narrow distribution of language sizes develops, i.e. the whole population is 
distributed about equally among the surviving languages. Fig. 3 shows for an 
intermediate population a power law on the small-size side of the histogram, 
and a parabola-like curve, meaning a log-normal distribution in this log-log 
plot, for large language sizes. 

A simple scaling law, seen in Fig. 4, predicts the behaviour of the number 
L of languages as a function of the maximum possible number M of languages 
and the final population N_∞ = K/2:

L/M = f(M/N_∞) .

The scaling function f(z) equals unity for small z and decays as 1/z for large
z. This means that for a population much larger than the possible number 
of languages, each language possibility is realized, while in the opposite limit 
each small group of individuals speaks its own language. Therefore we expect 
this simple scaling law to be valid also for longer bit-strings than the 8 and 
16 bits simulated here. (32 bits allow for 4096 Mega languages, requiring too
much computer memory in our program; 30 bits still worked.) 
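
Written out, the two limits of the scaling law and the count of 32-bit languages quoted above read as follows. This only restates the text in formulas; the proportionality constant in the second limit is not specified here.

```latex
\frac{L}{M} = f\!\left(\frac{M}{N_\infty}\right), \qquad
\begin{cases}
f(z) \simeq 1 \;\Rightarrow\; L \simeq M, & M \ll N_\infty,\\[4pt]
f(z) \propto 1/z \;\Rightarrow\; L \propto N_\infty, & M \gg N_\infty,
\end{cases}
\qquad
2^{32} = 2^{12} \cdot 2^{20} = 4096 \cdot 2^{20}.
```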

We also modified the model to take into account the influence of a "superior"
language on another, like the many words of French origin in the
German language. With some probability q, at the moment of a mutation
the new value of a bit is not the opposite of the old value (as done above) but
is the value of the corresponding bit in the superior language. We define as
superior language the bit-string having one everywhere except for a zero in
the left-most position, i.e. 127 for 8 and 32767 for 16 bits. The larger q is (in
the smooth regime of large p = 0.48 per individual), the higher is the fraction
of samples ending with the superior language as the largest one. About
half of the samples have the superior language as the numerically strongest
one if q ~ 0.02 for 8 and 0.2 for 16 bits. If for 16 bits we take 127 instead
of 32767 as the superior language, the results do not change much. (These 
probabilities hold for 10 million people and are appreciably larger, 0.05 and 
0.34, for one million.) 
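
The modified mutation can be sketched as follows. The function names are hypothetical, and the sketch assumes that the left-most position is the most significant bit of the integer and that the function is called only once a mutation has already been decided.

```python
import random


def superior_language(bits: int) -> int:
    """Bit-string with ones everywhere except a zero in the left-most position."""
    return (1 << (bits - 1)) - 1


def mutate_with_superior(language: int, bits: int, q: float, superior: int,
                         rng: random.Random) -> int:
    """Mutate one random bit: copy it from `superior` with probability q, else flip it."""
    mask = 1 << rng.randrange(bits)
    if rng.random() < q:
        # Take over the value of the corresponding bit of the superior language.
        return (language & ~mask) | (superior & mask)
    return language ^ mask  # ordinary mutation: flip the bit


print(superior_language(8), superior_language(16))  # 127 32767
```

Note that with probability q the chosen bit may already agree with the superior language, in which case the child's language is unchanged.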

4 Discussion 

Our model is more microscopic than the previous ones known to us [HI E] 
in that individuals are born, give birth, and die, instead of being lumped 
together into one differential equation. It also is more realistic since we 
allow for numerous languages instead of only two. For the latter choice, we 
would have to reduce our bit-string to a single bit, with M = 2 and thus 
M/N << 1, corresponding to the left part of Fig. 4. There we observe L = M,
that means both languages survive. In jB] only one language survived since 
one was assumed to be superior compared to the other. We, on the other 
hand, regarded all languages as intrinsically equally fit, except for the last 
paragraph. 

We thank P.M.C. de Oliveira for suggesting to simulate languages, and 
the Jülich supercomputer center for JUMP time.

[Panel title, part a: Selected rates only: 1000 * mutations per bit = 14 (+), 16 (x), 30 (star)]
[Panel title, part b: 10 M people, 1000 iterations, 16 bits: 1000 * mutations per bit = 16, 18, ... 30 top to bottom]

Figure 1: Histograms of language sizes for 16 bits, one sample only of
K/2 = 10 million people, mutations per bit = 0.014 (+), 0.016 (x), 0.030
(stars) in part a and 0.016 to 0.030 in steps of 0.002 in part b.

[Panel title, part a: 1 K, 10 K, 100 K, 1 M, 10 M people, from left to right, 16 bits, 0.03 mutations per bit]
[Panel title, part b: 1 K, 2 K, 5 K, 10 K, 100 K, 1 M and 10 M people from left to right, 8 bits, 0.06 mutations per bit]

Figure 2: Histograms of language sizes for 16 bits (part a) and 8 bits (part
b), with same mutation rate 0.48 per individual, for different population
sizes, summed over up to 100 samples.



[Panel title: 2,500,000 people, 16 bits, 0.03 mutations per bit; and power law size "3.4; sum over ten samples]

Figure 3: Roughly log-normal size distribution, with higher values for small
sizes described by a power law.

[Panel title: Scaling for population, number of languages, maximal number of languages; 16 (+) and 8 (x) bits]

Figure 4: Scaling test: Symbols for 8 bits (x) and 16 bits (+) follow the
same scaling function f if plotted as L/M versus M/N_∞. Runs with 30 bits
and 10 or 100 million people fit in reasonably near the lower right corner (not
shown).